Asthma and PM2.5 Concentration Maps

The following maps show asthma prevalence and PM2.5 concentration in Bay Area counties. The data used to create the maps comes from CalEnviroScreen 4.0, released on October 20, 2021. The maps are drawn at the census tract level.

(I narrowed my scope to Bay Area counties rather than conducting a statewide analysis because otherwise my computer could not knit the file for upload to GitHub.)

Asthma Data

The asthma map shows the rate of Emergency Department visits for asthma per 10,000 people per year, averaged over 2015-2017. The data therefore only runs through 2017 and may no longer reflect current rates.

PM2.5 Data

The PM2.5 map contains data on fine particle pollution throughout the state. The data represents the annual mean concentration of PM2.5 in µg/m3 (a weighted average of measured monitor concentrations and satellite observations), averaged over three years, from 2015 to 2017. Like the asthma data, it only runs through 2017 and may be outdated.

Discussion of Maps

Both maps show disproportionately high asthma prevalence and PM2.5 concentration in certain urban areas, particularly in the East Bay and the eastern part of the Bay Area counties. Notably, although PM2.5 levels are high throughout many urban areas, asthma rates are not always distributed the same way. Overall, the two measures appear at least slightly correlated at first glance, so I test this impression with a more rigorous examination.

Asthma and PM2.5 Scatter Plot

Below is a scatter plot with PM2.5 on the x-axis and Asthma on the y-axis, with a best-fit line.

The line does not appear to fit the data well, since the data is not normally distributed. Most points are clustered in the bottom center of the plot, but some tracts have very high asthma rates, which skews the distribution upward. The best-fit line shows a slight positive correlation.
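
A scatter plot with a best-fit line like the one described here can be sketched in base R as follows. This is only an illustration, not the code used for this report: the `demo` data frame is simulated and hypothetical, standing in for `ces4_map`, and only the column names `Asthma` and `PM2.5` are taken from the regression call further down.

```r
# Hypothetical stand-in for ces4_map: simulated values, NOT CalEnviroScreen data
set.seed(1)
demo <- data.frame(PM2.5 = runif(200, min = 5, max = 15))
# Exponential noise gives a right-skewed response, loosely like the asthma rates
demo$Asthma <- 30 + 2 * demo$PM2.5 + rexp(200, rate = 1 / 25)

# Fit the same kind of model as in the report: Asthma regressed on PM2.5
fit <- lm(Asthma ~ PM2.5, data = demo)

# Scatter plot with PM2.5 on the x-axis and Asthma on the y-axis
plot(demo$PM2.5, demo$Asthma,
     xlab = "PM2.5 (annual mean, ug/m3)",
     ylab = "Asthma ED visits per 10,000",
     pch = 16, col = "grey40")

# Overlay the least-squares best-fit line from the fitted model
abline(fit, col = "red", lwd = 2)
```

Swapping `demo` for the real data frame would reproduce the plot in the report, assuming the columns carry the same names.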

Summary of Linear Regression Analysis

## 
## Call:
## lm(formula = Asthma ~ PM2.5, data = ces4_map)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -50.424 -21.485  -6.539  13.432 193.479 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  34.4917     1.6229   21.25   <2e-16 ***
## PM2.5         1.7228     0.1564   11.02   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 30.34 on 8022 degrees of freedom
## Multiple R-squared:  0.01491,    Adjusted R-squared:  0.01479 
## F-statistic: 121.4 on 1 and 8022 DF,  p-value: < 2.2e-16

Discussion

The best-fit line does not seem to represent the data very well. The median residual (-6.539) is not close to zero, and the residuals are not symmetrically distributed: they range from -50.424 to 193.479. The standard error of the slope coefficient is 0.156, about 9% of the estimate, so the slope is estimated with some imprecision.
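
The asymmetry of the residuals and the size of the standard error can be checked directly from the numbers printed in the regression summary. The sketch below does only that arithmetic; the variable names are chosen here for illustration and do not come from the original analysis.

```r
# Residual quantiles copied from the regression summary above
resid_q <- c(min = -50.424, q1 = -21.485, median = -6.539,
             q3 = 13.432, max = 193.479)

# A symmetric distribution would have tails of similar length around the median
lower_tail <- resid_q[["median"]] - resid_q[["min"]]
upper_tail <- resid_q[["max"]] - resid_q[["median"]]
tail_ratio <- upper_tail / lower_tail

# The t value is the estimate divided by its standard error
slope_est <- 1.7228
slope_se <- 0.1564
t_value <- slope_est / slope_se

# Standard error expressed as a fraction of the estimate
relative_se <- slope_se / slope_est

print(round(c(tail_ratio = tail_ratio, t_value = t_value,
              relative_se = relative_se), 3))
```

The upper tail comes out several times longer than the lower tail, which matches the right skew visible in the scatter plot.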

Still, we can glean some information from this analysis. The linear regression suggests that an increase of 1 µg/m3 in annual mean PM2.5 is associated with an average increase of 1.7228 asthma ED visits per 10,000 people per year. Only 1.49% of the variation in Asthma is explained by the variation in PM2.5.
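
To make the interpretation of the coefficients concrete, here is a minimal sketch that plugs the reported estimates into the fitted line. The PM2.5 values of 9 and 10 µg/m3 are hypothetical inputs chosen for illustration, and `predicted_asthma` is a helper defined here, not part of the original analysis.

```r
# Estimates copied from the regression summary above
intercept <- 34.4917
slope <- 1.7228
r_squared <- 0.01491

# Fitted line: predicted asthma ED visits per 10,000 for a given PM2.5 level
predicted_asthma <- function(pm25) intercept + slope * pm25

# Hypothetical PM2.5 levels, one unit apart
at_9 <- predicted_asthma(9)
at_10 <- predicted_asthma(10)

# A one-unit increase in PM2.5 changes the prediction by exactly the slope
difference <- at_10 - at_9

# R-squared as a percentage of variance explained, and the implied correlation
pct_explained <- 100 * r_squared
implied_r <- sqrt(r_squared)

print(c(at_9 = at_9, at_10 = at_10, difference = difference))
print(round(c(pct_explained = pct_explained, implied_r = implied_r), 3))
```

Because this is a simple regression with one predictor, the square root of R-squared gives the magnitude of the correlation between the two variables, which is small here.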